Towards a streaming SQL standard

Authors

Namit Jain

Shailendra Mishra

Anand Srinivasan

Johannes Gehrke

Jennifer Widom

Hari Balakrishnan

Ugur Çetintemel

Mitch Cherniack

Richard Tibbetts

Stanley B. Zdonik

Abstract

This paper describes a unification of two different SQL extensions for streams and its associated semantics. We use the data models from Oracle and StreamBase as our examples. Oracle uses a time-based execution model while StreamBase uses a tuple-based execution model. Time-based execution provides a way to model simultaneity while tuple-based execution provides a way to react to primitive events as soon as they are seen by the system. The result is a new model that gives the user control over the granularity at which one can express simultaneity. Of course, it is possible to ignore simultaneity altogether. The proposed model captures ordering and simultaneity through partial orders on batches of tuples. The batching and the ordering are encapsulated in and can be modified by means of a powerful new operator that we call SPREAD. This paper describes the semantics of SPREAD and gives several examples of its use.
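The contrast between the two execution models can be illustrated with a minimal sketch. This is not the paper's SPREAD operator or either vendor's syntax. It only assumes a hypothetical tuple shape `(timestamp, payload)` and shows how the same input forms different ordered batches: under a time-based view, tuples sharing a timestamp fall into one batch and are treated as simultaneous, while under a tuple-based view each tuple is its own batch in arrival order. The function names are illustrative assumptions.

```python
from itertools import groupby
from typing import Iterable, List, Tuple

# Hypothetical tuple shape for illustration: (timestamp, payload).
Event = Tuple[int, str]


def time_based_batches(events: Iterable[Event]) -> List[List[Event]]:
    """Group tuples with equal timestamps into one batch (simultaneous tuples)."""
    # sorted() is stable, so arrival order is preserved within a timestamp.
    ordered = sorted(events, key=lambda e: e[0])
    return [list(group) for _, group in groupby(ordered, key=lambda e: e[0])]


def tuple_based_batches(events: Iterable[Event]) -> List[List[Event]]:
    """Treat every tuple as its own batch, in arrival order."""
    return [[e] for e in events]


events = [(1, "a"), (1, "b"), (2, "c")]
time_batches = time_based_batches(events)    # two batches: timestamp 1, timestamp 2
tuple_batches = tuple_based_batches(events)  # three single-tuple batches
```

In this sketch the granularity of simultaneity is fixed by the choice of function. The abstract describes a model in which the user controls that granularity, which this example does not attempt to reproduce.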


Similar resources

Fast Data Management with Distributed Streaming SQL

To stay competitive in today’s data driven economy, enterprises large and small are turning to stream processing platforms to process high volume, high velocity, and diverse streams of data (fast data) as they arrive. Low-level programming models provided by the popular systems of today suffer from lack of responsiveness to change: enhancements require code changes with attendant large turn-aro...


OBDA for Temporal Querying and Streams

Data changes worldwide in size and over time and when new data arrives rapidly from different sources, an easy access to dynamic data becomes a keyfactor. Therefore, temporalizing and streamifying ontology-based data access (OBDA) is a very important topic today, where the industry still relies on algebraic queries. We contribute to the practical efforts in this field by showing how a specific ...


A Generic Solution to Integrate SQL and Analytics for Big Data

There is a need to integrate SQL processing with more advanced machine learning (ML) analytics to drive actionable insights from large volumes of data. As a first step towards this integration, we study how to efficiently connect big SQL systems (either MPP databases or new-generation SQL-on-Hadoop systems) with distributed big ML systems. We identify two important challenges to address in the ...


Optimizations Enabled by Relational Data Model View to Querying Data Streams

We postulate that the popularity and efficiency of SQL for querying relational databases makes the language a viable solution to retrieving data from data streams. In response, we have developed a system, dQUOB, that uses SQL queries to extract data from streaming data in real time. The high performance needs of applications such as scientific visualization motivates our search for optimization...
